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Abstract 

Background: Network biology (systems biology) approaches are useful tools for elucidating the host infection 
processes that often accompany complex immune networks. Although many studies have recently focused on 
Hoemophilus parasuis, a model of Gram-negative bacterium, little attention has been paid to the host's immune 
response to infection. In this article, we use network biology to investigate infection with Haemophilus parasuis in 
an in vivo pig model. 

Results: By targeting the spleen immunogenome, we established an expression signature indicative of/-/, parasuis 
infection using a PCA/GSEA combined method. We reconstructed the immune network and estimated the network 
topology parameters that characterize the immunogene expressions in response to /-/. parasuis infection. The results 
showed that the immune network of/-/, parasuis infection is compartmentalized (not globally linked). Statistical 
analysis revealed that the reconstructed network is scale-free but not small-world. Based on the quantitative 
topological prioritization, we inferred that the ClR-centered clique might play a vital role in responding to /-/. 
parasuis infection. 

Conclusions: Here, we provide the first report of reconstruction of the immune network in IH. parasuis-infected 
porcine spleen. The distinguishing feature of our work is the focus on utilizing the immunogenome for a network 
biology-oriented analysis. Our findings complement and extend the frontiers of knowledge of host infection 
biology for /-/. parasuis and also provide a new clue for systems infection biology of Gram-negative bacilli in 
mammals. 

Keywords: Pig model, Haemophilus parasuis, Spleen, Immunogenome, Network, Quantitative topology. Scale-free, 
GIR 



Background clinical symptoms, pathology and diagnosis, susceptibil- 

Glasser's disease is caused by Haemophilus parasuis ity and epidemiology, pathogenic biology, vaccine devel- 

(shortened as H, parasuis or HPS), a model Gram- opment, and evaluation of virulence-associated factors 

negative bacillus. This disease is an important cause of [2-6]. Many of these investigations have highly focused 

economic loss in the world's pig industry, which is on the major aspects of biology and pathogenesis for the 

clinically characterized by fibrinous polyserositis, H, parasuis bacterium itself. Aside from several recent 

polyarthritis and meningitis [1]. To date, the major focus studies [7-11], the molecular mechanisms of the pig host 

of studies of porcine Glasser's disease has centered on that are involved in the response to the H. parasuis inva- 
sion have not been well addressed. More importantly, 
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needs, pigs infected with H. parasuis could also serve as 
mammalian and human models for bacterial infectious 
diseases. 

It is well known that the immune genes (hereafter re- 
ferred to as immunogenes) have played central roles in the 
regulation of pathogen-induced host processes in vivo, in- 
cluding those of Glasser's disease. Systems biology (also re- 
ferred to as network biology) approaches have brought a 
research paradigm for infectious diseases; for example, a 
systems biology program was recently initiated by the 
National Institute of Allergy and Infectious Diseases [13]. 
Systems biology investigations of the transcriptome of host 
immunogenome could provide a profound exploration of 
the molecular events occurring, for example, the three- or 
even four-dimensional relationships between genes during 
a response to pathogen infection. This would increase our 
understanding of host resistance/susceptibility genes, im- 
mune response mechanisms, and molecular basis of host- 
pathogen interactions [14,15]. So, the systems biology 
approaches can also provide us with powerful tools for 
uncovering the molecular immune mechanisms that defend 
against H, parasuis infection. 

Custom-build gene chips have been widely applied in 
a variety of investigations [16-21]. On many occasions, a 
reduced fragment of microarray data could work more 
efficiently to reveal more subtle insights into the target 
biological phenomena than the non-reduced global 
genome data do [22-24]. As a consequence, an analysis 
focusing on immunogenes could give a more precise 
exploration of the transcriptomic landscape of infection- 
induced immune processes in hosts. 

In the body's immune system, spleen is an important 
target organ for studies of immune mechanisms. It has 
been well documented that the spleen is a crucial immune 
organ to protect the body against a variety of diseases and 
infections [25-27]. The spleen, known as the blood cleaner 
for its role in capturing foreign antigens and destroying 
old red blood cells, is made up of a variety of immune cells 
and blood cells, including B cells, T cells, macrophages, 
dendritic cells, natural killer cells and red blood cells 
[28-31]. When migratory macrophages and dendritic 
cells bring antigens to the spleen, the immune cells 
(e.g., T- and B -lymphocytes) become activated and trigger 
a series of immune responses [32-35]. Although not ob- 
ligatory for survival, it has been proven that the spleen 
plays a key role in mounting immune responses to 
antigens, and in the absence of the spleen, the body would 
be more susceptible to infections [36]. Consequently, the 
spleen is one of the ideal organ models for studying host 
immune responses to pathogenic challenges, including the 
H, parasuis infection. 

Our previous study has used the Affymetrix Porcine 
Genechip to profile the differentially expressed genes 
between spleens with and without administrations with 



the H. parasuis [10]. There were totally 931 differentially 
expressed transcripts, of which only a small fragment 
has been annotated as immunogenes. The result showed 
that the unfocused global expression profiling based on 
a full-genome array couldn't reveal the subtle roles of 
immunogenes. In the present study, we aim to clarif)^ 
the subtle roles of immunogenes in the host response to 
H, parasuis challenge. Using the pig {Sus scrofa) as an 
in vivo model, we first characterized the microarray ex- 
pression dataset of the spleen's immunogenome. Based on 
the partitioned immunogenome dataset, we performed a 
comprehensive immunomic analysis, which included re- 
construction of the immune network and evaluation of 
network parameters and quantitative topological proper- 
ties. Our investigation revealed a vital network component 
in response to H, parasuis infection. To our knowledge, 
this is the first network biology analysis of the spleen 
immunogenome upon challenge with a Gram-negative bac- 
terium in mammals. 

Results and Discussion 

Characterization of the immunogene dataset 

We used the GeneChip® Porcine Genome Array 
(Aff)^metrix) to measure gene expressions of porcine 
spleen from three normal and three H, parasuis -infected 
samples from six separate piglets. By extracting immune 
pathways from KEGG and reactome databases (see the 
Additional file 1), a total of 1,999 transcripts from the 
20,201 transcripts arrayed on the chip were targeted as 
immunogenes according to the pathway annotation 
results. The basic annotation information of Affymetrix 
probesets and corresponding transcripts of immunogenes 
is shown in the Additional file 2. Among the subsection 
of 1,999 transcripts of immunogenes, a total of 1,115 trans- 
cripts were detected to call Present in both normal and H. 
parasuiS'infected samples. Additional file 3: Table SI gives 
the descriptive statistical parameters used to evaluate the 
expression of the signals of the immunogenes on at least 
one chip using the mas5calls method. 

The dataset of immunogenes partitioned from the 
Affymetrix Porcine Genome Genechip was evaluated by 
exploratory multivariate analysis. First, the principal 
components analysis (PCA) revealed that a total of 6 
principal components were detected, and their stan- 
dard deviations were estimated to be 2.279, 0.710, 
0.427, 0.248, 0.183 and 0.155, respectively. With the cu- 
mulative proportion up to 0.950, the major variation of 
dataset could be explained by the first two principal 
components. A 3D plot for the 6 chips (samples) under 
the coordinates of the first three principal components 
is given in Figure lA. The between-chip joint densities of 
immunogenes were also evaluated, and Figure IB gave an 
example for a 3D plot of the joint density between the 
samples 2 and 6. The symmetry of 3D plot of the between- 
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chip joint densities can be used to reveal the technical 
stableness of microarray experiments. Here, all of the 3D 
plots had a symmetrical appearance, which means that our 
microarray experiments were technically reliable. Based on 
the Euclidean distance. Figure IC also provides the 3D tree 
map for the hierarchical clustering of immunogenes under 
the first principal component [37]. 

In particular, we found that the second principal compo- 
nent is a factor to classify the control and infected samples, 
of which the loading coefficients differed in their positive 
and negative signs (i.e., -0.205, -0.229 and -0.211 vs. 0.593, 
0.094 and 0.066) (see the y-axis in Figure lA). It means that 
the immunogenes loaded on the second principal compo- 
nent can service as covariate classifier or genomic signature 
that could distinguish the samples. In the first 20 genes 
with the largest positive loadings on the second axis (see 



Additional file 3: Table S2), except for PMM2, all members 
were also identified to be differentially expressed (see 
Figure 2B and 2C and section 2 of the Results and Discus- 
sion). This result indicates that the PCA analysis might 
provide an alternative or potential solution for identification 
of differentially expressed genes. To evaluate the reliability 
of the result, GeneSet Enrichment Analysis (GSEA) was 
further conducted. Figure ID presents the enrichment plot 
for the gene set, and the detailed results of GSEA analysis 
are given in the Additional file 4. In this analysis, the En- 
richment Score (ES), Normalized Enrichment Score (NES), 
Nominal p-value. False Discovery Rate (FDR) q-value and 
Family- Wise Error Rate (FWER) p-Value are estimated to 
be -0.938, -1.286, 0.00, 0.119 and 0.00, respectively. The 
result of the new PCA/GSEA combined method shows that 
a core set of 16 genes are increasingly expressed in the 
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Figure 1 Exploratory multivariate evaluation of the immunogene dataset. (A) 3D plot of samples under the coordinates of the first three 
principal components. (B) Evaluation of joint densities: an example for the 3D plot of joint density between samples 2 and 6. (C) Hierarchical 
clustering on the factor map for immunogenes: 3D dendrogram of the first principal component. Here, the optimal level of division calculated 
suggests three clusters. (D) Enrichment plot for the gene set with the largest positive loadings on the second principal component. Profile of the 
running ES score and positions of GeneSet members on the rank-ordered list. 
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Figure 2 Evaluation plots for differentially expressed immunogenes. (A) Radial plots oft values, p values and logFC values for the 
differential expression test. The blue, red and green lines indicate t values, logFC values and p values, respectively. (B) Two-way hierarchical 
clustering of differentially expressed immunogenes. Rows represent genes and columns represent samples. Ctr-1, Ctr-2 and Ctr-3 were the control 
samples, and HPS-1, HPS-2 and HPS-3 were the Haemophilus paras u/s-infected samples. (C) Comparisons of expression values of differentially 
expressed immunogenes between samples. The normalized MASS expression values were used here. 



infected samples and could be used as an expression 
signature indicative of H. parasuis infection (see Additional 
file 4). 

Identification and GO annotation of differentially 
expressed immunogenes 

One of the routine goals for transcriptomic analysis is to 
identify differences in the expression between phenotypic 
covariates of samples. To improve the detection power of 
the differential expression test [23,24], the inter-quartile 
range (IQR) filter was used to remove those uninformative 
genes. The differentially expressed immunogenes between 
the control and H. parasuis-infected groups were identi- 
fied by empirical Bayes correction of the linear model 
[38], in which the cutoffs of p-value and logFC (log2-fold- 
change) were set as 0.05 and 1, respectively. The logFC, 



AveExpr (average log2-expression), t-statistic, p-value, 
adjusted p-value (q-value) and B-value (log odds value) for 
each gene can be found in the Additional file 5. 

According to the cutoff criteria, a total of 36 immuno- 
genes were detected to be differentially expressed. The 
estimates of t-value, p-value and logFC for all IQR-filtered 
immunogenes are given in Figure 2A. The hierarchical 
clustering presented in Figure 2B gives a distinct 
occurrence of differential expression pattern of these 36 
immunogenes in which, compared to the control group, 
there were 9 down- regulated genes and 27 up-regulated 
genes (see Additional file 3: Table S2), respectively. The 
visual distinction between the up- and down-regulated 
immunogenes of the two-way hierarchical clustering pat- 
tern supports the reliability of the results of the differential 
expression test. In addition, a comparison of expression 
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values of differentially expressed immunogenes between 
samples is displayed in Figure 2C, from which one can ob- 
serve that the H. parasuis infection has mainly resulted in 
increased activation of immunogenes. 

Before Gene Ontology (GO) annotations, all of the 
immunogenes were converted into human gene symbols 
because of the relative scarcity of pig gene resources in 
the public databases. The annotation contents comprised 
the three major GO categories (Cellular Compartment 
or "CC", Molecular Function or "MF", and Biological 
Process or "BP"). As depicted in Figure 3, a graphical pres- 
entation of the GO annotations shows that all of the 
presented nomenclatures for BP, MF, and CC are graphic- 
ally organized in systematic, hierarchical structures. If with- 
out considering the inclusive relationships of the upper 
hierarchical layers with the lower hierarchical ones, the 36 
differentially expressed genes were associated with 40 bio- 
logical processes. These processes mainly involve the 



immune processes, e.g., the immune system process, the 
leukocyte activation, the inflammatory response, the re- 
sponse to bacterium, and the immune system development. 
From the MF annotation results, there were 21 molecular 
functions identified. The majority of the 36 differentially 
expressed genes were linked to 2 classifications at the top 
hierarchical layer of nomenclatures of molecular functions, 
which include binding (34 genes involved) and molecular 
transducer activity (12 genes involved). The result of CC 
annotations revealed 11 cellular components, in which 
most of the immunogenes were preferentially assigned to 
the branch of plasma membrane. 

Reconstruction, visualization, and statistical evaluation of 
the immunogene network 

Based on mutual information between immunogenes, 
the C3NET algorithm was used to estimate the adja- 
cency matrix of the gene network [39]. Here, all of the 
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Figure 3 GO annotations for differentially expressed immunogenes. In the hierarchically organized graphics, the target annotations 
are marked in red, which include GO BP/MF/CC names, numbers of genes and adjusted p values. (A) Results of the Biological Process 
annotations. A total of 40 biological processes are involved. The lower histogram is a bar plot of the Biological Process annotations, including 
Biological Process names and the number of genes. (B) Results of the Molecular Function annotations of 21 molecular functions. The bar plot 
displays Molecular Function names and the number of genes. (C) Results of the Cellular Component annotations, in which 1 1 cellular 
components are displayed. The bar plot includes Cellular Component names and the number of genes. 
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immunogenes that passed the IQR filter were used to 
make the network inference. There are two reasons why 
we used the IQR-passed genes rather than the differentially 
expressed genes for network inference: 1) the number of 
differentially expressed immunogenes was relatively small, 
and, more importantly, the differential expression revealed 
the dimension-reduced or projected relationships between 
genes on the phenotypic covariate axis, which was not 
equivalent to the context of high-dimensional relation- 
ships of members in the network; and 2) it is accepted 
that not all members from a network will be simultaneously 
detected to be differentially expressed in real biological 
samples, and thus a network reconstructed only by 



differentially expressed genes was inadequate to depict the 
real topological architecture. 

Based on this modified routine, therefore, the planar 
and three-dimensional graphical representations of the 
network components are given in Figures 4A and 4B, 
and we can see that the network has incoherent topo- 
logical characteristics where there are 35 total clusters 
(cliques) involved. The largest cluster, centered on CIR, is 
made up of 16 members and the clusters with their hub's 
degrees equal to or above 3 covers approximately 60% 
immunogenes. In addition to the known immunogenes 
involved in bacterial infections (e.g., S100A8 and IL4R) 
[40], most of them are the previously unreported members 




R 2 = 0.9285 
Pr{>|t|) of intercept < 8.6e-12 
Pr(>|t|)of slope<2e-16 



25 



Figure 4 Visualization and statistical evaluation of the compartmentalized network diagram. (A) 2D plot for network topological structure. 
The white vertices are the down regulated, differentially expressed immunogenes, and the grey vertices are the upregulated genes. (B) 3D plot for 
network topological structure. In this sub-scenario, balls denote vertices (immunogenes), and lines denote edges (putative regulations) between 
two immunogenes. (C) Test of the scale-free property of the immunogene network. The logarithmic forms of degrees and their probabilities 
were linearly fitted. The red line represents the best fit. (D) Test of the average path length of immunogene network. The histogram is the 
distribution of the average path length of random graphs from 10000-time simulations. The vertical broken red line locates at x= 1.91 1, and it 
represents the estimation of the average path length of the reconstructed immunogene network. 
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participating in the H. pamsuisAnduced network. Having 
obtained the topological structure of the inferred net- 
work, we further estimated the quantitative topological 
parameters. For a global quantitative evaluation of the 
reconstructed network, Table 1 gives the majority of 
topological parameters, which includes network density, 
diameter, node degree, average path length, hub score, 
betweenness, and closeness. Among these network cliques, 
the ClR-centered clique has the highest node degree, 
betweenness score and closeness. In general, some topo- 
logical parameters of gene network can be endued with 
concrete biological meaning. For example, the node degree 
reveals the linking density (or regulatory intensity) between 
genes, and according to the principle of information theory, 
the betweenness score could be interpreted as the genes 
power to control the network. The closeness, one measure 
of network centralities, has the similar biological meaning 
of the betweenness score. Given the results of network 
parameters, the ClR-centered clique could be considered 
as the most important component of the H. parasuis- 
induced network. 

The holistic properties of the network topology were 
further evaluated. The logarithmic forms of degrees and 
their probabilities (proportions) are graphically plotted 
in Figure 4C, two of which obviously represent a linear 
relationship. This means that the degree distribution sat- 
isfies a power law. If the degree distribution follows a 
power law, we say that the network is scale-free [41]. So, 
it can conclude that the network we reconstructed in 
this study is scale-free. Figure 4D displays the distribution 
of average path lengths of random networks (10000-time 
simulations based on the Erdos-Renyi model [42]), in which 
the vertical broken red line locates the estimation of aver- 
age path length of the H, parasuis infection network. The 
location of the vertical broken red line indicates that 
the reconstructed network has a much smaller average path 
length than those of simulated random network. In 
addition, the clustering coefficient of the H, parasuis infec- 
tion network is estimated to be zero, which is also much 
smaller than the average value of those from the randomly 



Table 1 Topological parameters for the immunogene 
network 



Parameter 


Estimate 


Parameter 


Estimate 


Density 


0.009 


Diameter 


16.118 


Betweenness score (edge)* 


6.731 


Degree* 


1.567 


Betweenness score 
(vertex)* 


2.515 


Vertex closeness* 


0.006 


Average path length 


1.911 


Burt's constraint 
score* 


0.863 


Graph strength 


8.368 


Modularity value* 


0.619 


Dice similarity coefficient* 


0.022 


Kleinberg's hub 
score 


13.090 



* denotes the mean of estimates. 



simulated networks. According to the complex network 
theory, the estimation results of average path length and 
clustering coefficient suggest that the immune network we 
reconstructed is not small-world. 

There are two possibilities to explain these results. One is 
that the nodes in the reconstructed network did not cover 
the list of all potential immunogenes, and the absence of 
absent members decreases the global connectivity of the 
network. The other is that not all real biological networks 
are always small-world, and the network of H, parasuis in- 
fection might be strongly segmented or compartmentalized. 
Regarding the latter possibility, similar to Mycobacterium 
bovis bacillus Calmette-Guerin [43], the H, parasuis infec- 
tion might initiate independent signalling cascades of 
various immune regulatory pathways that lead to a sparse- 
splitting immunogene network. Although many biological 
networks have been proven to belong to the small-world 
category, there have also been studies to support the 
second possibility. These include the long-range inter- 
action networks in protein structure [44], the metabolic 
network of E. coli as defined by atomic mappings 
[45], the KSHV PPI network [46], the global network 
of Avian Influenza outbreaks [47], the sequence-based 
chemoinformatics threshold networks for drug target [48], 
and the network for phenolic secondary metabolism of 
T. cacao [49]. Therefore, the small-world property might 
be typical of networks, but might not be true for all real 
biological networks. 

Detection of the infection-induced network components 

Here, by combining gene network ideas with differential ex- 
pression, the network components involving differentially 
expressed immunogenes are considered to participate in, or 
at least be tightly associated with, the biological process 
of H, parasuis infection. Through mapping the mem- 
bers of differentially expressed immunogenes into the 
reconstructed network, we found that there were seventeen 
hub genes (of which the degree is defined here to be not 
lower than three), in which nine hubs (i.e., CIR, ADM, 
ARG2, BCL6, CD46, CD3E, CD163, CDID, and LYZ) were 
involved in differentially expressed immunogenes. Although 
they were identified as being involved in the infection 
process of H, parasuis^ most of the hub immunogenes 
themselves had no significant expression changes in the 
differential expression test. There were only two mem- 
bers (i.e., CDID and CD 163) that were identified to be dif- 
ferentially expressed. As can be seen in Figure 4A, when 
challenged with H, parasuis ^ CDID was down-regulated, 
coupling with down-regulations of CD3D and PSEN2 and 
up-regulation of TNFRSFIB. Despite up-regulation of 
CD 163, its linked neighbours were not detected to have sig- 
nificant changes in statistics. Furthermore, there were a 
total of 16 network clusters that had included the differen- 
tially expressed genes, and the network clusters mediated 
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by CIR, CD ID and BCL6 were involved in at least four dif- 
ferentially expressed genes (see Figure 4A). In these 
clusters, 6 clusters involved down-regulated genes and 13 
involved up-regulated ones. This means that there are three 
clusters being involved in both down-regulated and up- 
regulated genes. 

Use of expression data for prioritization of net- 
work components is one of the extended applications 
of gene network analysis. The goals of network-based 
prioritization approaches often involve ranking the im- 
portance of genes or of network components. According 
to the statistics of graph theory, one of the prioritization 
approaches is to rank the betweenness scores (one type 
of measures of centrality) of vertices and edges [50]. 
Following this principle, we found that CIR had the lar- 
gest betweenness score, and there were 12 vertices with 
the betweenness scores more than 10 (see Table 2). 
From the results, it was found that the importance of 
genes ranked by the betweenness scores correlated with, 
but did not exactly match, the ranks of vertex degrees. 
There is also other documented evidence consistent 
with our finding that the vertex degree was not always 
reliable for determining the importance of network 
components [51]. Moreover, in the first 19 genes with 
larger betweenness scores, there were only two members 
belonging to differentially expressed genes. That is, the 



Table 2 Rank of the betweenness scores of edges and 
vertices 



Edges 


Betweenness 
score 


Vertices 


Betweenness 
score 


CP --C1R 


39 


CIR 


102 


FCN2 -- CFH 


28 


ARG2 


36 


S100A8 -- ADM 


18 


ADAM 17 


34 


CAMP-- ADAM 17 


16 


CFB 


26 


CCRL2 - ADAM 17 


16 


ADM 


25 


CEBPB-CIR 


15 


AHSA2 


21 


CEBPD-CIR 


15 


ST6GAL1 


14 


CFB -CIR 


15 


PAKl 


12 


CXCL2 -CIR 


15 


BCL6 


10 


FCNl - CIR 


15 


CD163 


10 


111 ORB -CIR 


15 


CDID 


10 


IL4R -CIR 


15 


IFITl 


10 


ITGA5 - CIR 


15 


CCR1 


9 


PTPNl - CIR 


15 


ATXN2 


8 


S100A9 -CIR 


15 


CAMP 


8 


THBSl - CIR 


15 


CXADR 


8 


CASPl - C5AR1 


15 


LYZ 


8 


TNFRSF17- THBD 


15 


CDIA 


7 


CD3D- CDID 


14 


CD46 


6 



ranking order of betweenness scores was almost completely 
uncorrected with the results of differential expression test, 
and the differentially expressed tests only provide a low ac- 
curacy for the prioritization of genes. 

In the reconstructed network, there were 19 edges 
with betweenness scores greater than 10. According to 
the estimates of betweenness scores, the importance of 
the edge between CP and CIR was found to be much 
higher than others. Except for the edge between CIR 
and CD55 (only estimated as 1), the betweenness scores 
of all edges linked to CIR were found to be equal to or 
greater than 15. More importantly, the ClR-mediated 
network cluster was also involved in the largest number 
of differentially expressed immunogenes. It is known 
that CIR is one of the early complement proteases, 
which plays a crucial role in immune responses against 
microbial pathogens. Based on the network-based 
prioritization, despite not being differentially expressed 
gene, CIR and its co-expressed genes are considered to 
be the most important network components associated 
with H. parasuis challenge. In our opinion, both CIR 
and its co-expressed members might play key roles in 
the coordination of host defense mechanisms against the 
H, parasuis infection. 

Evaluation of immunogenes with the higher betweenness 
scores (at least 10) and bioinformatic validation of the 
CIR-centered clique 

To confirm the in vivo expressions of the immunogenes 
with the betweenness score equal to or more than 10, a 
total of 12 genes were validated by quantitative real-time 
PCR (qPCR). The primer sequence, melting temperature 
and product sizes for 12 immunogenes in the qPCR ana- 
lysis are shown in Table 3. The detailed results for each 
gene are listed in the Additional file 6. The results 
for a panel of these 12 genes show that, with the ex- 
ception of the down-regulated CDID and PAK2, most 
of them were up-regulated in infected samples (see 
Figure 5), which is in agreement with the microarray 
analytical results. 

Due to the mathematical essence of a static network 
based on mutual information, a network clique simply 
defines a functionally related gene set, which means that a 
clique is essentially denotes the extent of functional coup- 
ling expressions of a gene set. Given this, we focused on the 
most important clique and conducted a pathway enrich- 
ment analysis (PEA) to detect the possible existence of dir- 
ect or indirect regulation among the members appearing in 
the CIR-centered clique. Here, the web tool Gene Set 
Analysis Toolkit V2 was used to perform pathways enrich- 
ment analysis [52,53], and the parameters of enrichment 
analyses were set as follows: the statistical test used 
the hypergeometric method, the multiple test adjust- 
ment used the Benjamini and Hochberg method, the 
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Table 3 Primer sequences, melting temperatures and product sizes for twelve immunogenes in the qPCR analysis 



Gene symbol 



Primer sequence (5'-3') 



Target size (bp) 



Tm rC) 



ADAM 17 


Sense: TGGCAGACAACATCGTGGG 
Anti-sense: GGGCTOATGATGCGAACG 


106 


59 


ARG2 


Sense: GCATOCGCCCCGAAG 
Anti-sense:CCACTGAGCGAGGATOACT 


233 


65 


C1R 


Sense: TGTGGACCTGGATGAATGT 
Anti-sense: AATAGCCGCCGATGTAG^ 


101 


59 


CFB 


Sense: CATOATGATATOGGGACCT 
Anti-sense: GCCCAACCCCAAACACGTA 


92 


59 


CDID 


Sense: GTGGCTCCTCTACGACATCT 
Anti-sense: TTCAGGCJTCACJTGCYTCJC 


88 


55 


BCL6 


Sense: TOTCATAGTGGTGAGCCGAGAG 
Anti-sense: TGAAGTCCAGGAGGATGCAGAAC 


168 


65 


ADM 


Sense: CGCTACCGCCAGAGTATGAACA 
Anti-sense: CGTCCTOTCmGTCCGTGAAC 


121 


59 


IFIT1 


Sense: AATOGCCACAGGTCA^ 
Anti-sense: ATOCATACACAACACTCT 


144 


55 


ASHA2 


Sense: AATCTCCTGCTATATOGAAG 
Anti-sense: AGTOCTACATCTCCA^ 


126 


55 


PAK1 


Sense: CACTOCTAmCTCCAACT 
Anti-sense: CTO1TCTOTGC1TCTCA 


113 


59 


ST6GAL1 


Sense: GACATCATOAAGAAATCTC 
Anti-sense: AAGAAOTCTGGTAGTAGTA 


164 


59 


RPL32 ' 


Sense: CGGAAGmCTGGTACACAATGTAA 
Anti-sense: TGGAAGAGACGTOTGAGCAA 


94 


55-65 



' The transcript levels of RPL13 were used for sample normalization. 



significance level was set to 0.05, and the minimum 
number of genes for a category was set to 2. In terms 
of option selection, the pathway database resources 
included the Pathway Commons, Wikipathways, and 
KEGG pathway databases. 

The results of pathway enrichment analysis are listed in 
Table 4, which identifies genes in the ClR-centered clique 
that were significantly enriched in more than 30 signaling 
pathways. The existence of coupling relationships between 
these immunogenes is primarily supported by the bioinfor- 
matic analysis, and there are surely potential direct or indir- 
ect regulations between the members of the ClR-centered 
clique. In particular, by combining Tables 2 and 4, the expli- 
cit graphical connection of "CFB ~ CIR" was found to be 
enriched in several known pathways, which include the ini- 
tial triggering of complement (DB_ID:505), the comple- 
ment cascade (DB_ID:503), and the innate immunity 
signaling (DB_ID:804) in Pathway Commons database, 
and the complement and coagulation cascades (04610) in 
the KEGG pathway database. The PEA analysis primarily 
provides a digital validation of the reliability of the CIR- 
centered clique. 



Conclusions 

Recently, many studies have focused on the H. parasuis, 
a model Gram-negative bacterium. However, among 
these studies, none have paid attention to the host im- 
mune network and its quantitative topology. In this 
work, by targeting the spleen immunogenome, we have 
reconstructed the immune network and probed the 
network topology parameters that characterize the 
immunogenome-wide expression behaviors in response 
to H, parasuis infection. Our analyses suggest that 
the reconstructed immune network is scale-free but not 
small-world. To our knowledge, we report the first in- 
vestigation into the immunogenome-focused network 
biology analysis of H, parasuis infection. Compared with 
our previous investigation [10], the immunogenome- 
focused study has mined much new information about 
the host infection biology of Gram-negative bacterium 
H, parasuis. Although the number of replicates only met 
the basic requirements for sample size of microarray 
studies, the results showed that the immunogenome- 
focused strategy we used has worked efficiently. In 
addition, our results are valuable and may have potential 
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ARG2 BC:L6 CIK ST6GALI CVE IFITI AHSA2 ADAM 17 CD ID PAKI 



Figure 5 qPCR evaluation of the twelve immunogenes with betweenness scores greater than 10 in the control and HPS-infected 
samples. ARG2, BCL6, CIR, ST6GAL1, CFB, IFITI, ASHA2, ADAM17, ADM and GDI 63 were up-regulated in HPS-infected samples (red bars) 
compared with normal controls (blue bars), and GDID and PAKI were down-regulated. The vertical axis denotes the fold change calculated 
based on the mean intensity value from each individual within each group using the comparative Gt method. The expression differences were 
analyzed with T-tests. Asterisks (*) above the bars represent that the expression differences of the genes between the control and infected 
groups are significant at the level of p < 0.05. 



applications, for instance, our results might provide new 
or potential targets for interrupting or alleviating the 
course of bacterial infections. 

In summary, we used network biology approaches to 
quantitatively characterize the nature of immune network 
responding to H. parasuis infection. Our results for the first 
time revealed an immunogenome-focused network of por- 
cine spleen challenged with H, parasuis, which also provide 
a step toward a network biology-based understanding of in- 
fection with the Gram-negative bacilli in mammals. 

Methods 

Data source 

The basic raw data of six Afiymetrix chips used for the ex- 
traction of immunogenome data came from our previous 
study [10], in which the spleen tissues of three HPS infected 
piglets and three controls were individually used for the ex- 
periment. The Affymetrix chip data has been deposited in 
the NCBI Gene Expression Omnibus (GEO) database 
under the GSE series accession number GSE11787. 

Web resources and tools 

The web resources and tools used in this study mainly in- 
clude the Affymetrix technical files (http://www.affymetrix. 
com/support/index.affx); the R packages (http://www.r- 
project.org/); the Bioconductor packages (http://www. 
bioconductor.org); the annotation tools of WebGestalt 
(http://bioinfo.vanderbilt.edu/webgestalt/); the KEGG 
database (www.genome.jp/kegg/); the reactome database 



(www.reactome.org/); the GSEA analysis tool (http:// 
www.broadinstitute.org/gsea/downloads.jsp); the igraph R 
package (http://cneurocvs.rmki.kfld.hu/igraph/); and two- 
way hierarchical clustering (http://faculty.ucr.edu/~tgirke/ 
Documents/R_BioCond/My_R_Scripts/my.colorFct.R). 

Statistical analyses 

R/Bioconductor is open-source, freely available, and widely 
used for high-throughput data analyses in a variety of bio- 
logical fields. In this study, the R/Bioconductor packages 
and self-written procedures in R statistical environment 
(available upon request: zhumengjin@mail.hzau.edu.cn) 
were used to perform the statistical analyses. The affy pack- 
age was used to perform the low-level processing processes 
that included quality control, background correction, PM 
correction, summarization, normalization and probeset 
filtering. The non-specific filtering, principal components 
analysis and differentially expressed test were mainly 
realized by the packages of genefilter, FactoMineR, and 
limma. The javaGSEA Desktop Application tool was used 
for GSEA analysis. The packages used for graphical 
representations mainly included graphics, stats, MASS, 
miscSd, plotrix and RColorBrewer. The packages of c3net 
and igraph, and self-written procedures, were used for net- 
work reconstruction, including estimations of mutual infor- 
mation, adjacency matrix, network parameters and network 
topological properties, and graphical representations. All of 
these packages are free and can be downloaded from the 
websites: www.bioconductor.org and cran.r-project.org. 
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Table 4 Bioinformatic analysis to identify the existence of potential regulations in the C1R-mediated cluster through 
pathway enrichment analysis 



Pathway names 



Enriched genes in pathway 



P value 



Adjust P value 



1. Pathway Commons database 

Inflammatory Response Pathway 

Focal Adhesion 

TGF Beta Signaling Pathway 

Senescence and Autophagy 

Adipogenesis 

Initial triggering of complement 

Complement cascade 

Syndecan-4-mediated signaling events 

IL6-mediated signaling events 

Exocytosis of Alpha granule 

Platelet degranulation 

IL4-mediated signaling events 

Response to elevated platelet cytosolic Ca^^ 

Innate Immunity Signaling 

Platelet Activation 

Regulation of p38-alpha and p38-beta 

Proteogylcan syndecan-mediated signaling events 

p38 MARK signaling pathway 

Plasma membrane estrogen receptor signaling 

BMP receptor signaling 

Regulation of nuclear SMAD2/3 signaling 

TGF-beta receptor signaling 

Regulation of cytoplasmic and nuclear SMAD2/3 signaling 
Glypican 1 network 
Glypican pathway 

2. Wikipathways database 

Inflammatory Response Pathway 

Focal Adhesion 

TGF Beta Signaling Pathway 

Senescence and Autophagy 

Adipogenesis 

3. KEGG database 
ECM-receptor interaction 
Focal adhesion 

Cytokine-cytokine receptor interaction 
Complement and coagulation cascades 
Hematopoietic cell lineage 
Jak-STAT signaling pathway 



IL4R, THBSl 
ITGA5, SPPl, THBSl 
SPPl, THBSl 
CEBPB, THBSl 
CEBPB, CEBPD 
CIR, CFB 
CIR, CFB 
ITGA5, THBSl 
CEBPB, CEBPD 
ITGA5, THBSl 
ITGA5, THBSl 
CEBPB, IL4R 
ITGA5, THBSl 
CIR, CFB 
ITGA5, THBSl 
CEBPB, PTPNl 
ITGA5, THBSl 
CEBPB, PTPNl 
CEBPB, PTPNl 
CEBPB, PTPNl 
CEBPB, PTPNl 
CEBPB, PTPNl 
CEBPB, PTPNl 
CEBPB, PTPNl 
CEBPB, PTPNl 

IL4R, THBSl 
ITGA5, SPPl, THBSl 
SPPl, THBSl 
CEBPB, THBSl 
CEBPB, CEBPD 

ITGA5, SPPl, THBSl 
ITGA5, SPPl, THBSl 
IL4R, CXCL2, 111 ORB 
CIR, CFB 
ITGA5, IL4R 
IL4R, 111 ORB 



4.42e-05 

2.93e-05 

0.0001 

0.0002 

0.0009 

6.60e-05 

6.60e-05 

0.0003 

0.0003 

0.0003 

0.0004 

0.0005 

0.0005 

0.0007 

0.0008 

0.0028 

0.0035 

0.0037 

0.0048 

0.0053 

0.0101 

0.0101 

0.0101 

0.0224 

0.0258 

4.42e-05 

2.93e-05 

0.0001 

0.0002 

0.0009 

2.75e-06 

3.76e-05 

8.72e-05 

0.0002 

0.0004 

0.0012 



0.0001 
0.0001 
0.0002 
0.0003 
0.0009 
0.0006 
0.0006 
0.0019 
0.0019 
0.0019 
0.0024 
0.0026 
0.0026 
0.0030 
0.0033 
0.0104 
0.0124 
0.0125 
0.0150 
0.0159 
0.0246 
0.0246 
0.0246 
0.0380 
0.0386 

0.0001 
0.0001 
0.0002 
0.0003 
0.0009 

1 .65e-05 

0.0001 

0.0002 

0.0003 

0.0005 

0.0012 
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Quantitative real-time PGR 

Two-step quantitative RT-PCR (qPCR) was performed 
on the same spleen RNA samples used for the micro- 
array experiments. Total RNA were treated with DNasel 
(Tubo kit, Ambion) and reverse transcribed using the 
RevertAid™ First Strand cDNA Synthesis Kit (Fermentas) 
according to the manufacturer's instructions. We used 
the ribosomal protein L32 (RPL32) gene as an in- 
ternal control. qPCR was run on the LightCycler® 
480 Real-Time PGR System (Roche), in which the 
SYBR® Green Real-time PGR Master Mix (TOYOBO 
GO., LTD, Japan) was used as the readout. The cyc- 
ling parameters were as follows: 95x°G, 2 min; 95°G, 
15 s; X°G as appropriate, 15 s, where X is 55, 59 or 
65°G depending on the primer pair used; and 72°G, 
20 s for 45 cycles. After PGR, a melt-curve analysis 
of each primer pair was carried out to verify the spe- 
cificity of the PGR assay. The correct fragment sizes 
of the PGR products were confirmed using agarose 
gel electrophoresis (2%). Each primer set amplified a 
single product as indicated by a single peak present 
for each gene during melting curve analyses. The relative 
quantitative gene expression level was evaluated using the 
comparative Gt method. The AGt values were calculated 
by subtracting the RPL32 Gt value for each sample from 
the target Gt value of that sample. The duplicates for each 
sample were averaged, and pairwise t tests were conducted 
to determine differential expression between control and 
infection. In all qPGR analyses, the significance level was 
set £itp< 0.05. 

Additional files 
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